The Evolution of the Hedge Fund

Guest Lecture for MIT 18.S096
Topics in Mathematics with Applications in Finance

Jonathan Larkin

October 2, 2025

Disclaimer

This presentation is for informational purposes only and reflects my personal views and interests. It does not constitute investment advice and is not representative of any current or former employer. The information presented is based on publicly available sources. References to specific firms are for illustrative purposes only and do not imply endorsement.

About Me

Managing Director at Columbia Investment Management Co., LLC; generalist allocator; leads Data Science and Research. Formerly CIO at Quantopian, Global Head of Equities at Millennium Management LLC, and Co-Head of Equity Derivatives Trading at JPMorgan.

This presentation is available at github.com/marketneutral/hedge_fund_evolution.

What Evolution?

Two trends

  • Unbundling
  • Human + Machine Collaboration

Background & Theory

What is a model-driven forecast?

  • Fit a model \(f\) to past data by minimizing prediction error: \[ \min_f \sum_t L\big(y_t, f(X_t)\big) \]

  • Use \(f\) to predict future outcomes: \(\hat{y}_k = f(X_k), \quad k > t\)

  • Inputs \(X_t\): features at time \(t\) (e.g., momentum, valuation, sentiment)

  • Target \(y_t\): what we want to predict (e.g., next-week return)

  • Model \(f\): linear regression, random forest, neural net

  • Goal: \(\hat{y}_k \approx y_k\) on new, unseen data

  • Tasks: regression (continuous \(y\)) or classification (discrete \(y\))
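The bullets above can be made concrete with a tiny sketch (the data and the linear-model choice are illustrative assumptions, not anything from the lecture): fit \(f\) by minimizing squared prediction error on past data, then apply it to a new input.

```python
# Toy model-driven forecast: minimize sum of squared errors over past data,
# then predict an unseen point. Data values here are made up for illustration.

def fit_linear(xs, ys):
    """Closed-form least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

# Past data: feature X_t (e.g., momentum) and target y_t (e.g., next-week return)
X_train = [0.1, 0.2, 0.3, 0.4]
y_train = [0.02, 0.04, 0.06, 0.08]   # exactly y = 0.2 * x

a, b = fit_linear(X_train, y_train)
y_hat = a * 0.5 + b                  # prediction on new, unseen input X_k = 0.5
```

The same template covers classification by swapping the loss \(L\) and the model family \(f\).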

Condorcet Jury Theorem (1785)

  • If each juror has probability > 1/2 of being correct, and errors are independent, then as the number of jurors n increases, the probability that the majority decision is correct approaches 1. \[ P(\text{majority correct}) \to 1 \quad \text{as } n \to \infty \]

  • e.g., sklearn.ensemble.VotingClassifier combines multiple models, but independence/diversity of errors matters.
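The theorem is easy to check numerically: with independent jurors, the majority is a binomial tail probability. A minimal stdlib sketch (the per-juror accuracy 0.6 is an arbitrary illustration):

```python
from math import comb

def p_majority_correct(n, p):
    """Probability that a majority of n independent jurors, each correct
    with probability p, reaches the correct decision (n assumed odd)."""
    k_min = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# With p = 0.6, the majority improves as the jury grows, approaching 1:
probs = [p_majority_correct(n, 0.6) for n in (1, 11, 101)]
```

Note the theorem's assumptions do the work: if jurors' errors are correlated, adding jurors helps far less, which is exactly why model diversity matters in ensembles.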

Boosting Weak Learners (1988)

  • Kearns, Michael. Thoughts on Hypothesis Boosting. 1988.
  • Friedman, Jerome H. Greedy function approximation: A gradient boosting machine. 2001.
  • Sequentially train many “weak learner” models, each focusing on the errors of the previous ones.
  • Gradient boosted decision trees remain the dominant approach in tabular machine learning today.
  • e.g., sklearn.ensemble.HistGradientBoostingClassifier, xgboost, lightgbm, catboost

Boosting in a Nutshell

  • \(F_M\) is the ensemble model. After M rounds: \[ F_M(x) = F_0(x) + \sum_{m=1}^M \gamma\, h_m(x) \]
  • Each round fits \(h_m\) to the negative gradient of the loss at \(F_{m-1}\), then updates: \[ F_m(x) = F_{m-1}(x) + \gamma\, h_m(x) \]
  • \(\gamma\) is the learning rate; \(h_m\) is a weak learner (e.g., shallow tree).
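The update rule above can be sketched from scratch (the midpoint-split stump and toy data are my simplifications; real libraries grow full trees). With squared loss, the negative gradient at \(F_{m-1}\) is just the residual \(y - F_{m-1}(x)\), so each round fits \(h_m\) to the residuals:

```python
# Minimal gradient boosting for squared loss: F_m = F_{m-1} + γ h_m,
# where h_m is fit to residuals (the negative gradient of squared loss).

def fit_stump(xs, residuals):
    """Weak learner: a depth-1 'tree' splitting xs at its median."""
    split = sorted(xs)[len(xs) // 2]
    left = [r for x, r in zip(xs, residuals) if x < split]
    right = [r for x, r in zip(xs, residuals) if x >= split]
    left_val = sum(left) / len(left) if left else 0.0
    right_val = sum(right) / len(right) if right else 0.0
    return lambda x: left_val if x < split else right_val

def boost(xs, ys, rounds=50, lr=0.1):
    f0 = sum(ys) / len(ys)                                  # F_0: constant fit
    learners, preds = [], [f0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]      # negative gradient
        h = fit_stump(xs, residuals)
        learners.append(h)
        preds = [p + lr * h(x) for p, x in zip(preds, xs)]  # F_m update
    return lambda x: f0 + lr * sum(h(x) for h in learners)

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # a step function
F = boost(xs, ys)
mse = sum((F(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Each round shrinks the remaining residual by a factor related to the learning rate, which is why many small steps with shallow learners work so well.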

Model Stacking (1992)

  • Wolpert, David H. Stacked Generalization. 1992.
  • Train “meta-model” on the predictions of independent base models.
  • Works best when base models are diverse and capture different aspects of the data.
  • e.g., sklearn.ensemble.StackingClassifier

Stacking in a Nutshell

  • Combine several different models by training a meta-model on their predictions.
    • Train M independent base models \((f_1, \dots, f_M)\) (e.g., linear model, tree, neural net, etc.).
    • Using an appropriate cross-validation scheme, collect out-of-fold predictions for each training example to avoid leakage.
    • Train a meta-model, \(g\), on these predictions (optionally with the original features). \[ \hat{y}(x) = g\!\big(f_1(x),\, f_2(x),\, \dots,\, f_M(x)\big) \]
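A from-scratch sketch of those three steps (the two base models, the tiny dataset, and the no-intercept meta-model are illustrative assumptions): out-of-fold predictions feed a least-squares meta-model \(g\).

```python
# Stacked generalization: base models -> out-of-fold predictions -> meta-model.

def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: a * x + (my - a * mx)

def out_of_fold(fit, xs, ys, k=2):
    """Predict each point with a model trained on the OTHER folds (no leakage)."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k != fold]
        model = fit([x for x, _ in train], [y for _, y in train])
        for i, x in enumerate(xs):
            if i % k == fold:
                preds[i] = model(x)
    return preds

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]

p1 = out_of_fold(fit_mean, xs, ys)     # base model 1 predictions
p2 = out_of_fold(fit_linear, xs, ys)   # base model 2 predictions

# Meta-model g: weights (w1, w2) minimizing squared error of w1*p1 + w2*p2,
# solved from the 2x2 normal equations.
a11 = sum(p * p for p in p1); a12 = sum(p * q for p, q in zip(p1, p2))
a22 = sum(q * q for q in p2)
b1 = sum(p * y for p, y in zip(p1, ys)); b2 = sum(q * y for q, y in zip(p2, ys))
det = a11 * a22 - a12 * a12
w1 = (b1 * a22 - b2 * a12) / det
w2 = (a11 * b2 - a12 * b1) / det

y_hat = [w1 * u + w2 * v for u, v in zip(p1, p2)]
```

Here the meta-model learns to lean on the better base model; with richer, more diverse bases, \(g\) can exploit where each one shines.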

Ensemble Methods Summary

  • Ensembles reduce variance, improve robustness, and often yield better performance than individual models.
  • Voting combines models by majority vote.
  • Boosting sequentially builds models, each correcting the previous.
  • Stacking combines diverse models, leveraging their strengths.
    • Model Averaging is a special case of stacking: the meta-model is a weighted linear sum.
  • These approaches can be combined (e.g., stacking linear model into boosted tree).

The Dunbar Number (1992)

  • Dunbar, R. I. M. (1992). Neocortex size as a constraint on group size in primates. Journal of Human Evolution, 22(6), 469–493.
  • Max human maintainable stable relationships ≈ 150
  • Limit of trust & cohesion
  • Beyond limit → silos, slow decisions, culture strain

Wisdom of Crowds (2004)

  • Surowiecki, James. The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies, and Nations. Doubleday, 2004.
  • For the crowd to be smarter than experts, we require
    • Diversity of opinion → reduce blind spots
    • Independence of members → avoid groupthink
    • Decentralization → empower local knowledge
    • Aggregation of information → combine insights effectively

The Common Task Framework (2007-)

  • Donoho, D. (2017). “50 Years of Data Science.” Journal of Computational and Graphical Statistics, 26(4), 745–766.
    • Define a clear task (e.g., image recognition).
    • Provide dataset + ground truth labels + hidden test set.
    • Set evaluation metric (accuracy, F1, etc.).
    • Run open competition among researchers.
  • Netflix Prize (2006), Kaggle (2010), ImageNet (2012)…

Common Task Framework (cont’d)

  • “The Kaggle Grandmasters Playbook: 7 Battle-Tested Modeling Techniques for Tabular Data”, September 18, 2025, Nvidia Blog, link.

Machine, Platform, Crowd (2017)

  • Andrew McAfee and Erik Brynjolfsson. Machine, Platform, Crowd: Harnessing Our Digital Future. W. W. Norton & Company, 2017.
    • Wisdom of crowd means groups > individual experts
    • Platforms unlock assets (Uber, Airbnb)
    • Innovation from open-source & collaboration
    • Trust via ratings (leaderboards)
    • Success is \(f(\text{incentives}, \text{governance})\)

Theory Takeaways

  • Successes in machine learning demonstrate the value of ensemble methods.
  • The Common Task Framework (i.e., competitive data science) has driven scientific progress at scale.
  • Social science principles can inform the design of organizations and incentives to harness collective intelligence.

The Traditional Hedge Fund

How does a hedge fund scale?

  • 👉 By respecting Dunbar.
    • Pods → small teams, central risk
    • Quant → structure as an assembly line
    • Lean → keep a cap on size, preserve culture
    • Bureaucracy → heavy process to scale

Quant Hedge Fund Workflow

  • Larkin, Jonathan R., “A Professional Quant Equity Workflow”, Quantopian Blog, 2016, link.
  • Separate teams are focused along an assembly line
    • Data acquisition
    • Alpha research (aka feature engineering)
    • Signal combination (aka modeling)
    • Risk and transaction cost modeling
    • Portfolio construction (aka optimization)
    • Execution
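The assembly line above might be wired together as below. This is an illustrative skeleton only: every function body and data value is my placeholder assumption, not any fund's actual pipeline.

```python
# Toy quant workflow: data -> universe -> alphas -> combination -> optimization.

def load_data():            return {"AAA": [0.01, -0.02], "BBB": [0.03, 0.01]}
def define_universe(data):  return sorted(data)            # tradeable names
def alpha_momentum(data, universe):                        # one of N alpha signals
    return {s: sum(data[s]) for s in universe}
def combine_alphas(alphas):                                # signal combination
    return {s: sum(a[s] for a in alphas) / len(alphas) for s in alphas[0]}
def optimize(signal):                                      # portfolio construction
    gross = sum(abs(v) for v in signal.values())           # (real version: risk,
    return {s: v / gross for s, v in signal.items()}       #  t-costs, constraints)

data = load_data()
universe = define_universe(data)
signal = combine_alphas([alpha_momentum(data, universe)])
portfolio = optimize(signal)   # handed off to execution
```

The point of the structure: each stage has a clean interface, so separate teams can own separate stages.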

Quant Hedge Fund Workflow

  • Hope, Bradley. “With 125 Ph.D.s in 15 Countries, a Quant ‘Alpha Factory’ Hunts for Investing Edge.” Wall Street Journal, April 5, 2017. link

Quant Hedge Fund Workflow

flowchart LR

    DATA(Data) --> UDEF(Universe Definition)

    UDEF --> A1(alpha 1)
    UDEF --> A2(alpha 2)
    UDEF --> ADOTS(alpha...)
    UDEF --> AN(alpha N)

    A1 --> ACOMBO(Alpha Combination)
    A2 --> ACOMBO
    ADOTS --> ACOMBO
    AN --> ACOMBO

    DATA --> TARGET(Target)
    TARGET --> ACOMBO
    TARGET --> PCON
    DATA --> RISK(Risk & T-Cost Models)

    ACOMBO --> PCON(Optimization)
    RISK --> PCON

    PROD{{t-1 Portfolio}} --> PCON
    PCON --> IDEAL{{Ideal Portfolio}}
    IDEAL --> EXEC
    
    EXEC(Execution)

Unbundling

Introduction

  • New paradigm: decentralized hedge funds → part of the “workflow” is external
    • Crowdsourced intelligence
    • Blockchain-based incentives
    • Decentralized, global participation
  • Examples: Numerai, Yiedl, CrowdCent, CrunchDAO, et al.

Numerai

  • https://numer.ai/
  • San Francisco–based hedge fund (founded 2015)
  • Crowdsources stock market predictions from global data scientists
  • Uses a meta-model: ensemble of community models aggregated into live trading strategy
  • Mission: “solve the hardest problem in finance” with diverse, anonymized datasets

Numerai – Community & Competitions

  • Weekly data science tournaments
  • Participants submit ML predictions on anonymized market data
  • Scored on correlation with actual returns
  • Staking with NMR tokens:
    • Stake NMR = confidence in predictions
    • Rewards for accuracy, penalties for errors

Numerai – Data & Roles

  • Company: curates data, manages fund, constructs stake-weighted meta-model
  • Community: builds models, submits predictions, stakes tokens
  • Data: anonymized, obfuscated stock/market datasets
  • Company executes trades; community generates intelligence

Numerai – Role of Crypto

  • Numeraire (NMR): first hedge-fund-issued cryptocurrency
  • Staking aligns incentives: accurate models gain rewards, inaccurate lose stake
  • All payouts and stakes occur via Ethereum smart contracts
  • Supports trustless, global participation

Yiedl

  • https://yiedl.ai/
  • Founded 2023, fully DAO-based hedge fund
  • Mission: replace fund managers with blockchain data science tournaments
  • Operates via on-chain vaults on Optimism + Synthetix
  • Managed by token holders, not a centralized GP

Yiedl – Community & Competitions

  • Weekly crypto-asset prediction tournaments
  • Participants forecast returns for 75 crypto assets, stake Yiedl tokens
  • Predictions stored/encrypted on IPFS & blockchain
  • Smart contracts auto-reward/penalize based on accuracy
  • Two tracks:
    • Neutral vault: rank assets for market-neutral strategy
    • UpDown vault: propose portfolio weights

Yiedl – Data & Roles

  • Data: curated decade-long crypto datasets (prices, on-chain metrics, sentiment)
  • Community: builds predictive models, stakes tokens, governs DAO
  • Company/DAO: develops infrastructure, aggregates predictions, executes trades via smart contracts
  • Division: community = what to trade, DAO = how to implement

Yiedl – Role of Crypto

  • YIEDL token:
    • Staking in competitions
    • Governance (DAO voting)
    • Rewards for accurate predictions
  • All trades executed via DeFi protocols (Synthetix)
  • On-chain fund = transparent, auditable, permissionless

CrowdCent

  • https://crowdcent.com/
  • Crowdsourced investment platform that combines fundamental analysts + machine learning
  • Mission: democratize asset management with human + machine intelligence
  • “Next-generation of investing: decentralized, systematized, democratized”

CrowdCent – Community & Competitions

  • Fundamental Analysts: share investment theses (e.g. via SumZero)
  • Internal data scientists: build ML models on analyst ideas + market data → portfolio
  • Community data scientists: provide forecasts for competitions (e.g. Hyperliquid Challenge to rank crypto assets by 10d and 30d returns)
    • Leaderboards, percentile scoring, meta-model aggregation

CrowdCent – Data & Roles

  • Data:
    • Fundamental research from SumZero (analyst reports, ratings)
    • Market + crypto data, engineered features
  • Community:
    • Fundamental Analysts generate qualitative ideas
    • Data Scientists build quantitative models, submit predictions
  • Company: runs competitions, integrates data, builds NLP models, constructs portfolios

CrowdCent – Role of Crypto

  • No native token (as of 2025)
  • Crypto as an asset class: community builds crypto strategies (Hyperliquid challenge)
  • Integration with crypto projects:
    • Fund that stakes in Numerai’s NMR ecosystem
    • Bridges decentralized funds together

Conclusion

  • Numerai: pioneered crypto-incentivized crowdsourced hedge fund
  • Yiedl: DAO-based, on-chain hedge fund built entirely with DeFi
  • CrowdCent: blends human fundamental research with ML and crypto strategies
  • Common threads:
    • Community-driven intelligence
    • Machine learning aggregation
    • Cryptocurrency as incentive + infrastructure
  • Future hedge funds: open, decentralized, global

Human + Machine Collaboration

Types of Collaboration

  • Horizontal
  • Vertical

Horizontal

  • Human forecasts concatenated with features
  • Fit model on both
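A minimal sketch of the horizontal setup (the data and coefficient values are made-up assumptions): the human forecast becomes one more column in the feature matrix, and a single model is fit on machine features + human views together.

```python
# "Horizontal" collaboration: concatenate the human forecast with machine
# features, then fit one model on the augmented matrix.

machine_feature = [0.5, 1.0, 1.5, 2.0]      # e.g., a momentum feature
human_forecast  = [0.2, 0.1, 0.4, 0.3]      # analyst's return forecast
y               = [0.30, 0.30, 0.70, 0.70]  # realized return (toy target)

# Concatenate: each row is [machine feature, human forecast]
X = list(zip(machine_feature, human_forecast))

# Fit y ≈ w1*x1 + w2*x2 by least squares (2x2 normal equations, no intercept)
a11 = sum(x1 * x1 for x1, _ in X); a12 = sum(x1 * x2 for x1, x2 in X)
a22 = sum(x2 * x2 for _, x2 in X)
b1 = sum(x1 * t for (x1, _), t in zip(X, y))
b2 = sum(x2 * t for (_, x2), t in zip(X, y))
det = a11 * a22 - a12 * a12
w1 = (b1 * a22 - b2 * a12) / det   # weight on the machine feature
w2 = (a11 * b2 - a12 * b1) / det   # weight on the human forecast
```

The fitted weights tell you how much incremental information the human view carries beyond the machine features.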

Horizontal Example

  • Cao, S. S., Jiang, W., Wang, J. L., & Yang, B. (2024). “From Man vs. Machine to Man + Machine: The art and AI of stock analyses.” Journal of Financial Economics, 160, 103910. https://doi.org/10.1016/j.jfineco.2024.103910
  • Human forecasts: 948,054 twelve-month price forecasts (IBES, 2001–2018) for 6,190 firms by 11,341 analysts
  • Features for ML model: firm/industry fundamentals, textual disclosures (10-K, 10-Q, 8-K), and macroeconomic series (FRED)
  • Target: 12m forward return
  • Compare performance:
    • Human only: analyst forecasts
    • Machine only: an ensemble of ML algorithms (random forests, gradient boosting, and LSTM) trained on the features noted to predict year-ahead stock prices, excluding any analyst forecast inputs
    • Human + machine: same ensemble trained on analyst forecasts + features

Horizontal Example, Results

  • The machine only model beats human analyst forecasts 54% of the time
  • The human + machine model beats human forecasts 57% of the time

Vertical

  • Stepwise: human first, machine second (or vice versa)
  • e.g., human generates ideas, machine filters/ranks/optimizes

Vertical Example

  • deHaan, E., Lee, C., Liu, M., & Noh, S. (2025). “The Shadow Value of Public Information: Evidence from Mutual Fund Managers” (Stanford University Graduate School of Business Research Paper). Available at SSRN.
  • Constructs 170 alphas from public data (e.g., market, accounting, analysts, text, macro, ratings).
  • Uses 13-F filing data for 3,337 active, diversified U.S. equity mutual funds (1990–2020) as “human” portfolios.
  • Portfolio holdings returns are adjusted to subtract style benchmark (aka “DGTW” adjustment); i.e., human + machine is evaluated within fund style/risk/size/liquidity constraints.
  • AI analyst (just a random forest model!) that
    • makes an independent stock return forecast, one quarter ahead
    • uses that forecast to adjust the human manager’s portfolio

AI Analyst Adjustment Methodology

inputs:
  w_h[j]      # human start-of-quarter weight for stock j
  g[j]        # DGTW group of stock j
  decile[j]   # predicted decile within g[j] (1=worst ... 10=best)
  ŷ[j]        # predicted DGTW-adjusted return
  NAV         # fund net asset value at start of quarter

initialize:
  w_ai := w_h
  used := ∅   # prevent duplicate use of replacement names

# Keep strong human picks
for j in holdings sorted by descending w_h[j]:
    if decile[j] == 10:
        used.add(j)

Attempt upgrades for others

# Attempt upgrades for others (largest positions first)
for j in holdings sorted by descending w_h[j]:
    if decile[j] in 1..9:
        group := g[j]
        C := { s in group | decile[s] == 10 and s ∉ used }
        if C ≠ ∅:
            # choose best candidate
            k := argmax_s∈C ŷ[s]
            target_value := w_h[j] * NAV
            max_value := 0.20 * market_cap(k)
            delta_value := min(target_value, max_value)
            w_ai[k] +=  delta_value / NAV
            w_ai[j] -=  delta_value / NAV
            used.add(k)

Replace remaining bottom-decile names

# Replace remaining bottom-decile names with the group index
for j in holdings:
    if decile[j] == 1 and w_ai[j] > 0:
        idx := index_for_group(g[j])
        w_ai[idx] += w_ai[j]
        w_ai[j]   = 0

# Normalize / clean
project w_ai onto the simplex (weights ≥ 0, sum = 1)
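The final projection step can be made concrete with the standard sort-based simplex-projection routine. This specific algorithm is my assumption for illustration; the paper states the projection, not how it is computed.

```python
# Euclidean projection onto the probability simplex {w : w >= 0, sum(w) = 1}.

def project_to_simplex(w):
    """Return the closest vector to w with nonnegative entries summing to 1."""
    u = sorted(w, reverse=True)
    css, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        css += ui
        t = (css - 1.0) / i
        if ui - t > 0:        # ui stays positive after shifting by t
            theta = t
    return [max(x - theta, 0.0) for x in w]

# e.g., a weight vector left slightly infeasible by the earlier adjustments:
weights = project_to_simplex([0.6, 0.6, -0.1])
```

The shift \(\theta\) is chosen so that after clipping at zero, the surviving weights sum exactly to one.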

Vertical Example Results

  • Payout convention: AI’s incremental gain is paid out quarterly so human and AI start next quarter with equal AUM (conservative for AI in dollars).
  • Avg human Sharpe ≈ 0.47.
  • Avg paired Sharpe difference (AI-modified − human) ≈ +0.17 (t ≈ 69).
  • ~95% of funds show a positive Sharpe improvement.
  • Does this include transaction costs? Yes, and AI still helps net of costs.

Conclusions

  • Ensemble methods are powerful and widely used in machine learning.
  • The Common Task Framework has driven rapid progress in AI.
  • Social science principles can inform the design of incentives and processes to harness collective intelligence.
  • Hedge funds are unbundling, embracing crowdsourcing and crypto incentives.
  • Human + machine collaboration can significantly enhance investment performance.

Thank You!

  • Questions? Comments?